ABSTRACT

This paper deals with the problem of optimally locating files, and
determining their optimum number of redundant copies, in a vulnerable
communication network. It is assumed that each node and link of the
communication network can fail independently. The optimization problem
maximizes the probability that a commander can access the subset of
files that he needs while minimizing the network-wide costs related to
storage, query communication, and update communication. The problem
reduces to a linear zero-one integer program; several theorems that
reduce the complexity of its solution are presented.


1.1  INTRODUCTION 

This paper focuses on the problem of optimal redundant file allocation
for a very vulnerable distributed data base system. This file
allocation problem differs from previous file allocation problems
because it considers the following new items:

1.  Vulnerability  of  the  nodes  and  links  due 
to  enemy  actions,  e.g.  jamming 

2.  The  importance  of  the  users 

3.  The  importance  of  particular  files  to 
particular  users. 

First we shall discuss the motivation for the problem of optimal file
allocation in a vulnerable environment. Second we shall explain the
problem and its constraints. Third we shall discuss the possible
trade-off in costs. Fourth a brief literature survey will be presented.
Fifth the actual formulation will be explained. Sixth the various
theorems that have been developed for this formulation will be stated
and explained in words; however, we do not include the theorem proofs.
Seventh the conclusions and suggestions for further research will be
presented.

1.2  MOTIVATION 

The motivation behind our problem is in the C3 (Command, Control, and
Communications) context. In this context we are considering a Naval
Battle Group (BG) composed of aircraft carriers, cruisers, destroyers,
aircraft, etc. The BG gathers information through its organic sensors.
This information must somehow be stored and maintained as correctly as
possible because the BG must coordinate its actions. The system may be
considered as a distributed data base system.

The ships and planes can be considered as nodes, and the communication
channels between the ships and planes can be considered as links. The
data can be considered as files stored in the computers of the ships
and planes.

Considering the BG as a set of nodes, links and data files, we have
defined for ourselves a distributed data base network. Since in warfare
ships can be destroyed and communication links jammed, our network is
vulnerable. Therefore we must consider how to maintain a consistent and
complete data base.

If we also consider the individual warfare commanders and the data
files they need, the problem becomes more complex. We can also rank the
importance of each commander and the importance of each data file to
each commander and include this in our optimization problem.

1.3  PROBLEM

The problem is therefore as follows. We are given the following:

1.  A set of M data files.

2.  A set of N nodes to store the data files.

3.  The probabilities of any node or link being destroyed, from which
we obtain the probability that any particular commander at one node
can access any particular file at another node.

4.  A set of L commanders.

5.  The costs of assigning a particular file to any particular node.

6.  The query rates for any particular file emanating from any
particular node. The query rate is the rate at which files are
requested.

7.  The update rates for any particular file emanating from any
particular node. The update rate is the rate at which files are
updated (changed).

We desire the following:

1.  To locate single or multiple copies of the M files at the N nodes
such that the files will be accessible to the commanders who need the
files.

2.  To locate the files at nodes that will provide the least amount of
cost. The cost can be related to query communication costs, update
communication costs and file storage costs.

1.4  TRADE-OFF  OF  COSTS 

There is a trade-off in costs if we consider the following costs:

1.  query communication costs

2.  update communication costs

3.  storage costs

4.  the cost to the BG if a particular commander does not have access
to the particular file he desires.

To minimize any one of the four costs we can do the following:

1.  To reduce the query communication costs, we can store more
redundant copies of each file so that each query can find its file
with less communication cost. This is true since we shall assume that
each query goes to the nearest node containing the file.

2.  To reduce the update communication costs, we can store fewer
redundant copies of each file so that each update will update fewer
copies and incur less communication cost. This is true since we assume
each update goes to all the nodes containing the file.

3.  To reduce the storage costs, we can store fewer redundant copies
of each file.

4.  To reduce the cost of non-accessibility of particular files to
particular commanders, we can store more redundant copies of that
particular file so that it has a higher probability of being
accessible to the particular commander.

The bottom line is: we cannot both increase and decrease the number of
redundant copies of a particular file. We would like to find the
optimal number of redundant copies for all the files and where to
store them.
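
The trade-off can be made concrete with a small numerical sketch. The
four-node line network, the traffic rates, the storage cost and all
function names below are assumptions chosen only for illustration; the
non-accessibility cost is left out to keep the example short.

```python
from itertools import combinations

# Hypothetical 4-node line network: D[j][k] is the cost of one unit of
# communication from node j to node k (assumed values).
D = [
    [0, 1, 2, 3],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 1, 0],
]
QUERY_RATE = [1.0, 1.0, 1.0, 1.0]   # assumed query traffic per node
UPDATE_RATE = [0.5, 0.5, 0.5, 0.5]  # assumed update traffic per node
STORAGE_COST = 1.0                  # assumed cost of storing one copy


def query_cost(copies):
    """Each query goes to the nearest node holding a copy."""
    return sum(q * min(D[j][k] for k in copies)
               for j, q in enumerate(QUERY_RATE))


def update_cost(copies):
    """Each update goes to every node holding a copy."""
    return sum(u * sum(D[j][k] for k in copies)
               for j, u in enumerate(UPDATE_RATE))


def total_cost(copies):
    return query_cost(copies) + update_cost(copies) + STORAGE_COST * len(copies)


def best_for_each_count():
    """For r = 1..N copies, return (cost, placement) of the cheapest placement."""
    n = len(D)
    return {
        r: min((total_cost(c), c) for c in combinations(range(n), r))
        for r in range(1, n + 1)
    }
```

With these assumed numbers, going from one copy at node 1 to copies at
nodes 1 and 2 halves the query cost but doubles the update cost, so the
cheapest two-copy placement is still more expensive than the cheapest
single copy.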

2.  LITERATURE  SURVEY 

The file allocation problem was first investigated by Chu [1]: a
global optimization was considered, consisting of obtaining the minimum
overall operating costs subject to two kinds of constraints: first, the
expected time to access each file had to be less than a given delay,
and second, the amount of storage needed at each computer could not
exceed the available storage capacity. The number of copies of each
file was assumed to be fixed. A generalized model was defined, in which
storage and transmission costs were associated with file allocations;
channel queues were modeled in order to introduce the constraint on the
delay.

The resulting linearized integer program was characterized by a very
great number of variables even for applications of limited dimensions;
its solution was extremely hard from a conceptual viewpoint.

Casey [2], [?] considered the problem of allocating single files
separately, but the number of copies of each file was not assumed to be
fixed. Communication costs and storage costs of allocations were
analyzed in order to determine the optimal set of nodes on which the
file was to be allocated. The difference between retrieval and update
transactions was stressed: while retrieval transactions are routed to
only one copy of the file, update transactions are routed to all the
copies, in order to preserve consistency of redundant information.
Under the assumption of equal cost rates for retrievals and updates,
theorems were given for limiting the number of replicated copies of the
file on the basis of the update/query ratio: the benefit of keeping
replicated copies decreases as the update/query ratio increases.
Although the file allocation problem was analyzed for each file
separately, Eswaran [3] proved that Casey's formulation was NP complete
and, therefore, suggested investigating heuristic approaches.

Morgan and Levin [4] examined both the allocation of files and
transactions within a generalized, ARPA-like network. They adopted the
user's viewpoint, assuming the user to be under the jurisdiction of a
network management providing services at the market price.

Because of this characterization of the application environment,
storage capacity constraints were not included; the provision of
sufficient storage was considered a task of the network management.
Therefore, by introducing some other simplifying assumptions, the
authors demonstrated that the multiple file allocation problem could be
decomposed into independent (single) file allocation problems; they
also developed a heuristic solution technique.

Finally two contributions to the file allocation problem have been very
recently presented. Ramamoorthy and Wah [5] analyzed a relational
Distributed Data Base; they observed that the general approach of query
processing optimization consists in the minimization of communication
costs. These communication costs are mostly due to data moves which are
necessary for providing the logical correlation, expressed by the
query, between files stored on different nodes. A logical operation
which is particularly critical is the join operation between remote
files; a join between two files can be performed only if the two files
are co-located at the same node. Therefore the authors developed a
model in which redundant files are introduced in order to avoid
distributed joins, on the basis of the frequency of queries.


3.  NOTATION

$b_i$ = the available memory size of the computer at node $i$

$x_{ij}$ = 1 if the file $j$ is stored at node $i$, 0 otherwise

$A_i(I_l)$ = 1 if the commander at node $i$ can access file $l$ given
an assignment $I_l$; 0 if every node $k$ holding the file is destroyed

$r_j$ = the number of redundant copies of file $j$ stored in the system

$\lambda_{jl}$ = the volume of query traffic emanating from node $j$
for file $l$

$\psi_{jl}$ = the volume of update traffic emanating from node $j$ for
file $l$

$d_{jk}$ = the cost of a unit of communication from node $j$ to node $k$

$\sigma_{kj}$ = the cost of locating a copy of file $j$ at the $k$th
node

$T_{ij}$ = the maximum allowable query traffic time of the $j$th file
to the $i$th node

$a_{ijk}$ = the expected time for the $i$th node to query the $j$th
file from the $k$th node

$I_l$ = the set of node indexes representing a given assignment of
file $l$


4.  FORMULATION 

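Since $x_{il}$ is a zero-one variable, its sum over all nodes $i$ must
be equal to the number of redundant copies of file $l$. Therefore we
have:

$$\sum_{i=1}^{N} x_{il} = r_l$$

The minimization is subject to the following constraints:

$$x_{11} + x_{12} + \cdots + x_{1M} \le b_1$$
$$\vdots$$
$$x_{N1} + x_{N2} + \cdots + x_{NM} \le b_N$$

$$(1 - x_{i1})\, x_{k1}\, a_{i1k} \le T_{i1}$$
$$(1 - x_{i2})\, x_{k2}\, a_{i2k} \le T_{i2}$$
$$\vdots$$
$$(1 - x_{iM})\, x_{kM}\, a_{iMk} \le T_{iM}$$

$$x \ge 0$$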

The first term in the minimization corresponds to the cost of updating
file $l$ at node $k$ when the update was requested by node $j$, where
each node $k$ is a node that has file $l$. The second term denotes the
cost of querying file $l$ at node $k$ when the query was requested by
node $j$, where node $k$ is the closest node containing file $l$. The
third term represents the cost of storing file $l$ at node $k$. The
last term denotes the cost associated with the expected accessibility
of file $l$ to the commander at node $i$ weighted by the importance of
commander $i$.

The first set of constraints states that the number of files stored at
any node must not exceed the capacity of that node. The second
constraint states that the expected time to answer a query must not
exceed a certain threshold quantity. The last constraint states that
all the zero-one variables are nonnegative.

If we now examine the last term in the minimization, we can simplify
the expression. The expected value may be brought inside the summation.
Since the importance of commander $i$ and the importance of file $l$ to
commander $i$ are not probabilistic, we need only take the expected
value of the accessibility. Moreover, the expected value of the
accessibility is simply the probability that commander $i$ can access
file $l$ given the allocation of redundant copies of file $l$ in the
network.

We  have: 
I1(R(Mi))  -  6uAt(l,

(4.2) 

J 

m 

which denotes the accessibility of file $l$ to the commander at node
$i$ weighted by the importance of file $l$ to the commander at node $i$.
The initial formulation is as follows:


(4.4)

where $P_i(I_l)$ for one file $l$ is by definition:

which simplifies to the following:

where $p_{ij}$ is the probability of accessibility between nodes $i$
and $j$.

Substituting this back into the initial formulation we now have:

such that
If we remove the constraints, we are minimizing over disjoint sets, so
we have

$$\min_{I_1, \ldots, I_M} \sum_{l=1}^{M} C(I_l) = \sum_{l=1}^{M} \min_{I_l} C(I_l) \qquad (4.11)$$

Let us now try to minimize $C(I_l)$ for a particular $l$. The following
theorems set bounds on how to allocate the files and determine when not
to allocate files.
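
The decomposition in (4.11) can be checked by brute force on a toy
instance. The sketch below is only an illustration: the three-node line
network, the two files, and the per-file cost function `cost` (storage
plus nearest-copy query cost) are assumptions, not the cost function of
the formulation.

```python
from itertools import combinations, product

NODES = (0, 1, 2)                   # assumed 3-node line network
STORAGE = {"f0": 1, "f1": 5}        # assumed per-copy storage cost per file
QUERY_WEIGHT = {"f0": 3, "f1": 1}   # assumed query traffic weight per file


def cost(file_id, alloc):
    """Hypothetical C(I_l): storage of every copy plus nearest-copy query cost."""
    storage = STORAGE[file_id] * len(alloc)
    query = QUERY_WEIGHT[file_id] * sum(
        min(abs(j - k) for k in alloc) for j in NODES
    )
    return storage + query


def allocations():
    """All non-empty subsets of NODES (every file needs at least one copy)."""
    return [c for r in range(1, len(NODES) + 1)
            for c in combinations(NODES, r)]


def joint_minimum():
    """Minimize the summed cost over all files simultaneously."""
    files = list(STORAGE)
    return min(
        sum(cost(f, a) for f, a in zip(files, choice))
        for choice in product(allocations(), repeat=len(files))
    )


def separate_minimum():
    """Minimize each file on its own, then add the minima."""
    return sum(min(cost(f, a) for a in allocations()) for f in STORAGE)
```

Because no constraint couples the files, both functions return the same
value; the separate search examines 7 allocations per file instead of
all 49 joint combinations.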


5.  THEOREMS 


First let us look at allocating one file $l$ optimally by placing
redundant copies at different nodes, so without loss of generality
let $I = I_l$.


Theorem I:  If for $j = 1, 2, \ldots, n$ then an r-node assignment
cannot be less costly than the optimal one-node assignment if


1) 


p-  FT 


(5.1) 


and 

2) 


Y< 


(pr-1-1) (1+p) 


Pr 

N 


I  M- 


(1+0) 


j-l 


3  jl 


K 


k-2 

r 


+  l  minl.d  +  o,+  ±-  Jo. 

»  j-l  k  r  1  pr  k-2  ^ 


such that

$$\sum_{j=1}^{M} x_{ij} \le b_i, \qquad 1 \le i \le N$$

$$(1 - x_{ij})\, x_{kj}\, a_{ijk} \le T_{ij}$$

$$x \ge 0$$

We  know  that 
Theorem I states that if we have allocated r-1 copies of a file and the
two inequalities hold, then by allocating the rth copy our total cost
will not be less than that of allocating just one copy optimally. This
allows us to reduce the solution space which the integer program must
search: we can exclude all allocations with more than r-1 copies from
the possible file allocations before execution of the integer program.

Theorem II:  If for some integer r < n,


So substituting into the previous formulation we have:

then any r-node file assignment is more costly than an optimal one-node
assignment.

Theorem II states that if we have allocated r-1 copies of a file and
certain conditions hold, then by allocating the rth copy our total cost
will be more than that of allocating one copy optimally. As with
Theorem I, this allows us to exclude all allocations with more than r-1
copies from the possible file allocations before execution of the
integer program.

Define $u_k$ as follows:

$$u_k = \sigma_k + \gamma_k + \sum_{j} \psi_j d_{jk} \qquad (5.5)$$

where $\gamma_k = v_{(I \cup k)} - v_{(I)}$.


Then the cost function for any given assignment $I$ is given by:
Theorem III:  If

$$C(I) < C(I - \{k\}) \quad \text{for } k = 1, 2 \qquad (5.8)$$

then

$$C(I - \{k\}) < C(I - \{1, 2\}) \quad \text{for } k = 1, 2 \qquad (5.9)$$

Theorem III states that, in our cost graph, if a given vertex has a
cost less than the cost of each of the two vertices leading to it, then
the cost of the common predecessor of those two vertices is greater
than the cost of either of them.

Theorem IV:  Given an index set $X \subseteq I$ containing $r$ elements
with the following property:

$$C(I) < C(I - \{x\}) \quad \text{for each } x \in X \qquad (5.10)$$

Then for every sequence $R^{(1)}, R^{(2)}, \ldots, R^{(r)}$ of subsets
of $X$, such that $R^{(k)}$ has $k$ elements and
$R^{(k)} \subset R^{(k+1)}$, the following is true:

$$C(I) < C(I - R^{(1)}) < C(I - R^{(2)}) < C(I - R^{(3)}) < \cdots < C(I - R^{(r)}) \qquad (5.11)$$

Theorem IV states that if a given vertex has a cost less than the cost
of any vertex along the paths leading to it, then the sequence of costs
encountered along any one of these paths decreases monotonically. Thus
in order to find the optimal allocation policy, it is sufficient to
follow every path of the cost graph until the cost increases and no
further. This will give a set of locally optimum allocations, one of
which is the global optimum.

This allows us to reduce the solution space of the integer program.
Once we find a local optimum, we know that allocating any more copies
beyond it is not required. Hence the integer program will not have to
search for solutions in that part of the solution space.
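
The path-following idea can be sketched as a search that starts from
every one-node allocation and adds one site at a time only while the
cost decreases. The network, the rates and the cost function below are
assumptions for illustration (query to the nearest copy, update to all
copies, unit storage cost; the accessibility term is omitted), and the
guarantee of reaching the global optimum holds only for cost functions
that satisfy the hypotheses of Theorem IV.

```python
# Hypothetical 4-node line network and traffic (assumed values).
D = [
    [0, 1, 2, 3],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 1, 0],
]
QUERY = [4, 4, 4, 4]    # assumed query traffic per node
UPDATE = [1, 1, 1, 1]   # assumed update traffic per node
STORAGE = 1             # assumed storage cost per copy


def cost(alloc):
    """Illustrative cost of holding copies at the nodes in alloc."""
    query = sum(q * min(D[j][k] for k in alloc) for j, q in enumerate(QUERY))
    update = sum(u * sum(D[j][k] for k in alloc) for j, u in enumerate(UPDATE))
    return query + update + STORAGE * len(alloc)


def follow_paths():
    """Follow every path of the cost graph while the cost keeps decreasing."""
    n = len(D)
    best_cost, best_alloc = None, None
    seen = set()
    stack = [frozenset([k]) for k in range(n)]   # all one-node allocations
    while stack:
        alloc = stack.pop()
        if alloc in seen:
            continue
        seen.add(alloc)
        c = cost(alloc)
        if best_cost is None or c < best_cost:
            best_cost, best_alloc = c, alloc
        for k in range(n):
            if k not in alloc:
                bigger = alloc | {k}
                if cost(bigger) < c:             # stop once the cost increases
                    stack.append(bigger)
    return best_cost, best_alloc
```

On this assumed instance the search returns the two-copy allocation at
nodes 1 and 2 with cost 18, which matches an exhaustive enumeration of
all 15 non-empty allocations.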

Theorem V:  All optimal allocations will include site $i$ if

$$\lambda_i \min_{j \ne i} d_{ij} > z_i \qquad (5.12)$$

where

$$z_i = \sigma_i + \gamma_i + \sum_{j=1}^{N} \psi_j d_{ji} \qquad (5.13)$$


Theorem VI:  No optimal allocation including more than one site will
include site $i$ if the following is true:

$$z_i > \sum_{j=1}^{N} \lambda_j \left( \max_k d_{jk} - d_{ji} \right) \qquad (5.14)$$

Theorem V states that if the cost of having a local file copy is
smaller than the smallest possible cost of sending queries elsewhere,
then a local copy should unquestionably be included in the optimal
allocation. This will require certain nodes to have files allocated
there. Therefore the solution space required to be searched by the
integer program will be reduced. The integer program can ignore any
possible file allocation that excludes a copy that is unquestionably
allocated by Theorem V.

Theorem VI states the other extreme of Theorem V. If it costs more to
maintain a local copy than we could possibly save by having one, then
we do not want one. This theorem allows the integer program to ignore
allocating files in locations that are definitely too costly and
therefore reduces the possible solution space for the integer
programming solution. The integer program can ignore file allocations
that include copies ruled out by Theorem VI.

Define the following:

$$m_i = \lambda_i \min_{j \ne i} d_{ij} \qquad (5.15)$$

and

$$M_i = \sum_{j=1}^{N} \lambda_j \left( \max_k d_{jk} - d_{ji} \right) \qquad (5.16)$$

Then for each $i$ the real line is partitioned by $m_i$ and $M_i$ into
three regions: region I below $m_i$, region II between $m_i$ and $M_i$,
and region III above $M_i$.


If $z_i$ falls into region I then site $i$ should unquestionably be
included. If $z_i$ falls into region III it will be excluded unless all
$z_i$ fall in region III, in which case just include the largest one.
If $z_i$ falls in region II then it must be further considered.

These theorems are useful because now the region in which the integer
program must search for solutions is reduced. We can force files to be
allocated at sites in region I and not be allocated at sites in
region III.
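
The three-region test lends itself to a direct computation. In the
sketch below the function name `classify_sites` and the example values
are assumptions; `z` is taken as a given input (the per-site cost of
maintaining a local copy), `lam` is the query traffic per node, and `d`
is the matrix of unit communication costs. The special case in which
every site falls in region III is not handled here.

```python
def classify_sites(z, lam, d):
    """Return 'I', 'II' or 'III' for each site, following (5.15) and (5.16).

    z[i]   : cost of maintaining a copy at site i (taken as given)
    lam[j] : query traffic emanating from node j
    d[j][k]: cost of a unit of communication from node j to node k
    """
    n = len(z)
    regions = []
    for i in range(n):
        # m_i: cheapest possible cost of sending site i's queries elsewhere
        m_i = lam[i] * min(d[i][j] for j in range(n) if j != i)
        # M_i: the most that a copy at site i could possibly save
        M_i = sum(lam[j] * (max(d[j]) - d[j][i]) for j in range(n))
        if z[i] < m_i:
            regions.append("I")      # always include (Theorem V)
        elif z[i] > M_i:
            regions.append("III")    # exclude from multi-site allocations (Theorem VI)
        else:
            regions.append("II")     # must be considered further
    return regions


# Hypothetical 4-node line network used only as an example.
EXAMPLE_D = [
    [0, 1, 2, 3],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 1, 0],
]
EXAMPLE_LAM = [4, 4, 4, 4]
EXAMPLE_Z = [3, 5, 30, 7]
```

For these assumed inputs only sites 1 and 3 remain undecided, so the
integer program would have to branch on two sites rather than four.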


Theorem VII:  By choosing $d_{jk} = 1$, equal storage costs $\sigma$ at
all nodes, and a completely connected network, the cost function
reduces to the following:

(5.17)

The decision rule for the initial file assignment is $x_{ij} = 1$ if:

(5.18)

This theorem just states the initial file allocation for this special
type of network.

Theorem VIII:  Given a node $k$ and a file $j$, storing a copy of $j$
at $k$ leads to a reduction of the overall costs if the following
holds:

(5.19)

For allocating a new copy later, the theorem states that if allocation
of a new copy leads to a cost decrement for the host node which is
greater than the additional cost due to the necessity of updating and
storing the additional copy, then we should store the file there.

This theorem is useful because, under these assumptions, the file
allocation problem can be solved without integer programming, so the
NP-completeness of the general problem is avoided. It enables simple
calculations to determine file allocation.

Theorem IX:  Define the following allocations: $I' = I \cup \{k\}$ and
$I'' = I' \cup \{i\}$. If site $i$ satisfies


for some site $k$ in the network, then $C(I'') > C(I')$.

Theorem IX states that if site $i$ is sufficiently costly, then adding
site $i$ to an allocation which already includes $k$ increases the
total cost.

Theorem X:  Define the following allocations: $I' = I \cup \{k\}$ and
$I'' = I \cup \{i\}$. If sites $i$ and $k$ satisfy

$$z_i - z_k > \sum_{j=1}^{N} \lambda_j \max(d_{jk} - d_{ji}, 0) \qquad (5.21)$$

then

$$C(I'') > C(I') \qquad (5.22)$$

Theorem X states that if:

$$z_i - z_k > \sum_{j=1}^{N} \lambda_j \max(d_{jk} - d_{ji}, 0) \qquad (5.23)$$

is satisfied, then replacing site $k$ by site $i$ in an allocation will
increase the cost.

Theorem XI:  A site $i$ cannot be included in any optimal allocation if
there exists another site $k$ in the network such that

$$z_i - z_k > \sum_{j=1}^{N} \lambda_j \max(d_{jk} - d_{ji}, 0)$$

Theorem XI goes further than determining that no more than one of some
group of geographically close sites can be included in an optimal
solution: it states that certain sites may be excluded from optimal
allocations by the existence of better nearby sites. This theorem is
useful because it allows us to reduce the possible solution space of
the integer program. The integer program can ignore file allocations
that allocate separate copies of the same file at geographically close
nodes.

Theorem IX, Theorem X, and Theorem XI are extensions of work done by
Grapa and Belford [6].
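
The exclusion test of Theorem XI can be sketched as a pairwise
comparison of sites. The inequality used below is our reading of
condition (5.23), namely that the extra maintenance cost of site i over
site k exceeds the largest query saving that site i could offer; the
function name `dominated_sites` and the example data are assumptions
made for illustration.

```python
def dominated_sites(z, lam, d):
    """Sites i for which some other site k satisfies
    z[i] - z[k] > sum_j lam[j] * max(d[j][k] - d[j][i], 0).
    """
    n = len(z)
    excluded = []
    for i in range(n):
        for k in range(n):
            if k == i:
                continue
            # Largest possible query saving from using site i instead of site k.
            saving = sum(lam[j] * max(d[j][k] - d[j][i], 0) for j in range(n))
            if z[i] - z[k] > saving:
                excluded.append(i)   # a better nearby site k exists
                break
    return excluded


# Hypothetical 4-node line network used only as an example.
EXAMPLE_D = [
    [0, 1, 2, 3],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 1, 0],
]
EXAMPLE_LAM = [4, 4, 4, 4]
EXAMPLE_Z = [30, 5, 5, 7]
```

Here site 0 is ruled out because its neighbor, site 1, is far cheaper
to maintain and almost as close to every query source.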


Theorem XII:  The following are equivalent:

$$P_i(I) = \sum_{j=1}^{N} \prod_{k \ne j} (1 - x_k)\, x_j\, p_{ij}
+ \sum_{h=1}^{N} \sum_{j > h} \prod_{k \ne j, h} (1 - x_k)\, x_j x_h \left[ p_{ij} + (1 - p_{ij})\, p_{ih} \right]
+ \sum_{a=1}^{N} \sum_{b > a} \sum_{j > b} \prod_{k \ne j, a, b} (1 - x_k)\, x_a x_b x_j \left[ p_{ij} + (1 - p_{ij})\, p_{ib} + (1 - p_{ij})(1 - p_{ib})\, p_{ia} \right] + \cdots$$

and

$$P_i(I) = \sum_{j=1}^{N} x_j p_{ij} - \sum_{j=1}^{N} \sum_{k > j} x_j x_k\, p_{ij} p_{ik} + \cdots + (-1)^{N+1} \prod_{m=1}^{N} x_m p_{im} \qquad (5.26)$$


where $p_{ij}$ is the probability of accessibility between nodes $i$
and $j$.

This theorem essentially states a generalization of the well-known
probability law:

$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(AB) - P(BC) - P(AC) + P(ABC) \qquad (5.27)$$
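
As a numerical check of this law, the sketch below evaluates the
inclusion-exclusion sum for any number of copies and compares it with
one minus the probability of reaching no copy. It assumes that the
accessibility events for different copies are independent, so that the
probability of a joint event is a product; the function names and the
argument layout (`x` as the zero-one allocation vector, `p_i` as the
accessibility probabilities seen from node i) are illustrative
assumptions.

```python
from itertools import combinations
from math import prod


def access_by_inclusion_exclusion(x, p_i):
    """P(at least one copy reachable) as an alternating sum over subsets."""
    terms = [xj * pj for xj, pj in zip(x, p_i)]
    total = 0.0
    for r in range(1, len(terms) + 1):
        sign = 1.0 if r % 2 == 1 else -1.0
        # Sum of the products over all subsets of exactly r nodes.
        total += sign * sum(prod(c) for c in combinations(terms, r))
    return total


def access_by_complement(x, p_i):
    """Same probability, computed as 1 - P(no copy is reachable)."""
    return 1.0 - prod(1.0 - xj * pj for xj, pj in zip(x, p_i))
```

The complement form needs only N multiplications, while the
inclusion-exclusion form sums over all subsets, which is why it is
useful to know that the two agree.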
We have formulated the file allocation problem in a C3 context where
vulnerability is an issue. The formulation considers:

1.  The probability of a commander accessing files;

2.  The importance of commanders;

3.  The importance of particular files to particular commanders.

The theorems have provided ways to cut down on the possible file
allocations (solution space) in which the integer program has to
search. Therefore, we reduce the amount of time required to find a
solution using integer programming.

We have extended and proved twelve theorems, all applicable to the new
formulation.

In the C3 context, we may not need integer programming to find a
solution if we make the following assumptions:

1.  Connected network where all nodes are connected to each other;

2.  Cost of communication is the same between all nodes;

3.  Cost of storing a file is the same.

In the area of further research, we plan to explore the effects of
where the data sources are located on the file allocation problem.
This would be applicable in a C3 context, where sensor data may come
from only a fixed set of nodes. The data must also pass through a
processing node. The location of the processing node is also an
optimization problem which can be incorporated into our formulation.